RDFPath: Path Query Processing on Large RDF Graphs with MapReduce

Authors

  • Martin Przyjaciel-Zablocki
  • Alexander Schätzle
  • Thomas Hornung
  • Georg Lausen
Abstract

The MapReduce programming model has gained traction in different application areas in recent years, ranging from the analysis of log files to the computation of the RDFS closure. Yet, for most users the MapReduce abstraction is too low-level, since even simple computations have to be expressed as Map and Reduce phases. In this paper we propose RDFPath, an expressive RDF path query language geared towards casual users that benefits from the scaling properties of the MapReduce framework by automatically transforming declarative path queries into MapReduce jobs. Our evaluation on a real-world data set shows the applicability of RDFPath for investigating typical graph properties like shortest paths.
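To make the idea of expressing a path query as Map and Reduce phases more concrete, the following is a minimal in-memory sketch in Python. It is not RDFPath's actual translation scheme, which the abstract does not describe. It assumes one common approach: each round extends every partial path by one edge through a reduce-side join on the path's end node, and rounds repeat until the target is reached. The function names, the `knows` predicate, and the example triples are all hypothetical.

```python
from collections import defaultdict

# Hypothetical example graph as (subject, predicate, object) triples.
EXAMPLE_TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "knows", "dave"),
    ("dave", "knows", "erin"),
    ("carol", "knows", "erin"),
    ("erin", "knows", "frank"),
]


def map_phase(triples, paths):
    """Emit (join_key, record) pairs: edges keyed by subject, paths by end node."""
    for s, p, o in triples:
        yield s, ("edge", p, o)
    for path in paths:
        yield path[-1], ("path", path)


def shuffle(pairs):
    """Group records by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, record in pairs:
        grouped[key].append(record)
    return grouped


def reduce_phase(grouped):
    """Join each partial path with the outgoing edges of its end node."""
    for _node, records in grouped.items():
        edges = [r for r in records if r[0] == "edge"]
        partial = [r[1] for r in records if r[0] == "path"]
        for path in partial:
            for _tag, _pred, obj in edges:
                if obj not in path:  # avoid cycles
                    yield path + (obj,)


def shortest_path(triples, start, goal, max_rounds=10):
    """Run one simulated MapReduce round per hop until the goal is reached."""
    if start == goal:
        return (start,)
    frontier = [(start,)]
    visited = {start}
    for _ in range(max_rounds):
        extended = reduce_phase(shuffle(map_phase(triples, frontier)))
        frontier = []
        for path in sorted(extended):  # sorted only to make the result deterministic
            end = path[-1]
            if end in visited:
                continue  # a path of equal or shorter length already reached this node
            if end == goal:
                return path
            visited.add(end)
            frontier.append(path)
        if not frontier:
            return None
    return None


if __name__ == "__main__":
    print(shortest_path(EXAMPLE_TRIPLES, "alice", "frank"))
```

In a real deployment each round would be a separate MapReduce job over distributed data. The sketch only shows why a user would prefer to state the path declaratively and let a system generate such jobs.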

Similar resources

Scalable RDF Graph Querying Using Cloud Computing

With the explosion of semantic web technologies, conventional SPARQL processing tools do not scale well to large amounts of RDF data because they are designed for use in a single-machine context. Several optimization solutions combined with cloud computing technologies have been proposed to overcome these drawbacks. However, these approaches only consider the SPARQL Basic Graph Pattern pro...

Path Query Processing on Very Large RDF Graphs

Finding the shortest path between two nodes in an RDF graph is a fundamental operation that allows complex relationships between entities to be discovered. In this paper we consider path queries over graphs from a database perspective. We provide a full-fledged database solution to execute path queries over very large RDF graphs. We present low-level techniques to speed up shortest paths algori...

Distributed Storage and Query of Large RDF Graphs

RDF tuples are the building blocks of the semantic web. As more data are expressed as RDF tuples, storage capabilities become important. The data set will become so large that it must be stored across multiple machines. The data set will be partitioned into smaller subsets, each containing an incomplete picture of the data relationships. This has implications for q...

PigSPARQL: A SPARQL Query Processing Baseline for Big Data

In this paper we discuss PigSPARQL, a competitive yet easy-to-use SPARQL query processing system on MapReduce that allows ad hoc SPARQL query processing on large RDF graphs out of the box. Instead of a direct mapping, PigSPARQL uses the query language of Pig, a data analysis platform on top of Hadoop MapReduce, as an intermediate layer between SPARQL and MapReduce. This additional level of abstr...

An Effective and Efficient MapReduce Algorithm for Computing BFS-Based Traversals of Large-Scale RDF Graphs

Nowadays, a leading instance of big data is represented by Web data that lead to the definition of so-called big Web data. Indeed, extending beyond to a large number of critical applications (e.g., Web advertisement), these data expose several characteristics that clearly adhere to the well-known 3V properties (i.e., volume, velocity, variety). Resource Description Framework (RDF) is a signific...

Publication date: 2011